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ABSTRACT 

Surveys for gravitational lens systems have typically found a significantly 
larger fraction of lenses with four (or more) images than are predicted by stan- 



^ I dard ellipsoidal lens models (50% versus 25-30%). We show that including the 

' effects of smaller satellite galaxies, with an abundance normalized by the observa- 

o ■ 



tions, significantly increases the expected number of systems with more than two 
images and largely explains the discrepancy. The effect is dominated by satel- 
lites with ~ 20% the luminosity of the primary lens, in rough agreement with the 
^ I typical luminosities of the observed satellites. We find that the lens systems with 

satellites cannot be dropped from estimates of the cosmological model based on 
gravitational lens statistics without significantly biasing the results. 



1. Introduction 

While many gravitational lenses can be modeled as single, isolated, massive galaxies, 
this can only be an approximation. Both the luminosity functions of galaxies and the mass 
functions of halos derived from hierarchical structure formation predict that massive galax- 
ies are likely to be surrounded by lower mass halos. These halos modify the gravitational 
potential from that of the central (usually) elliptical galaxy and have several observational 
consequences. First, the satellite galaxies observed in many systems (e.g. MG0414+0534, 
Schechter & Moore (1993); B1030+074, Xanthopoulos et al. (1998); B1152+199, Myers et 
al. (1999), Rusin et al. (2002); and B1359+154, Myers et al. (1999) for the JVAS/CLASS 
sample) are required to obtain a successful model of the observed image geometries. The 
gravitational field produced by an offset nearby satellite is essentially impossible to mimic 
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through variations in the structure of the central lens galaxy or the addition of tidal shears. 
Second, the satellite galaxies can change the caustic structure of the lens, possibly helping to 
explain the relatively high numbers of observed quad lenses (e.g. Kochanek & Apostolakis 
(1988), Rusin & Tegmark, (2001), Moller & Blain (2001)). Third, even very low mass satel- 
lites can perturb the magnification tensors of individual images so as to produce patterns of 
image magnifications which cannot be reproduced by the central lens galaxy (Mao & Schnei- 
der (1998), Metcalf & Madau (2001), Chiba (2002), Dalai & Kochanek (2002), Schechter 
& Wambsganss (2002), Keeton (2003), Keeton, Gaudi, Fetters (2003), Kochanek & Dalai 
(2002; 2003), Evans & Witt (2003)). 

In this paper we focus on the second of these effects, the role of substructure in changing 
the expected numbers of images, in particular the ratio of lenses with 2 or 4 images. Standard 
statistical models consisting of an isolated elliptical galaxy in an external tidal field have 
difficulty reproducing the observed ratio of quad to double lenses. In particular, 50% (10 
of 20: 10 doubles, 9 quads and 1 with six images) of the published lenses found in the 
JVAS/CLASS surveys (Patnaik et al. (1992), Browne et al. (1998; 2002), Wilkinson et 
al (1998), King et al. (1999), Myers et al. (1995; 1999; 2003)) for lensed fiat-spectrum 
radio sources have more than two images, while the most recent models predict only 24- 
31% (Rusin & Tegmark, (2001), hereafter RT). The significance of this "quads to doubles 
ratio" problem has risen and fallen as the lens sample has grown (see King & Browne (1996), 
Kochanek (1996b), Keeton, Kochanek & Seljak (1997), Finch et al. (2002)), but the ratio 
has always seemed uncomfortably high. Most solutions, other than invocations of bad luck, 
such as the effects of core radii, high dark matter ellipticities or local tidal fields, are not very 
successful at changing the ratio without becoming either physically implausible or predicting 
large numbers of other unobserved image configurations. 

Compound lenses have been addressed previously within different approximations. Kochanek 
& Apostolakis ((1988)) examined the caustic/multiplicity and magnification structure for two 
lenses of the same mass at differing separations and redshifts. They focused on spherical 
lenses but indicated modifications which would be introduced by cllipticity as well. Seitz & 
Schneider ((1992)) considered multiplicities and magnifications for compound lenses. Moller 
& Blain ((2001)) calculated probabilities and some characteristic properties for lenses at two 
different redshifts but they did not consider quantitatively the role of the correlation function 
in enhancing the numbers of satellites clustered with the primary lens. RT made a limited 
study of the effects of adding faint SIS (singular isothermal sphere) satellites, concluding that 
low mass neighbors would have little effect. Here we extend their work to include higher 
mass neighbors as well (in a sequence of lens mass ratios), the correlation function of galaxies 
and the number density of satellites. In §2 we outline our calculation. In §3 we discuss the 
results and how they change when we vary some of our (standard) assumptions. In §4 we 
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summarize the conclusions and outline the issues needing further study. 



2. Methodology 



We will study the statistical consequences of including a satellite galaxy on the lensing 
properties of a more massive primary lens. In order to separate the effects of the clustering 
from the lensing properties of the two halos in isolation, we focus on estimates of the "excess" 
lensing cross sections and probabilities created by the clustering. In §2.1 we outline our 
method for calculating the cross sections, and mathematically define the excess lensing cross 
sections and probabilities. In §2.2 we determine the normalization of the density of satellites 
needed to match the observed numbers of satellites in the JVAS/CLASS sample. Finally, in 
§2.3 we define the lens models and methods used in our calculation. 



We label the primary lens and the satellite by their critical radii in isolation, bo and 
bi. The satellite is always taken to be less massive than the primary, bi < bo- We will use 
singular isothermal spheres (SIS) for our mass distribution, so the critical radii are related to 
the velocity dispersion a of the lens galaxy by 6 = 47r{a / {D is / Dos) ■ The quantities Dql, 
Dls and Dqs are comoving distances between the Observer, Lens and Source redshifts. For 
simplicity we place the lens at the median redshift of the observed CLASS lenses, zi — 0.63, 
and the source, Zg — 1.6, at twice the distance corresponding to the lens redshift in an 
Qo = 0.3 flat cosmological model. For an L* galaxy with cr* = 220 km/s this means that 
6o = 0'/7. This corresponds to a comoving separation of 5.5/i~^ kpc at the lens redshift. For 
the lens model we adopt, the redshifts have little consequence for the results. 

The distribution of satellite galaxies around the primary lens has two parts. There 
are satellites correlated (near) the primary lens and uncorrelated satellites projected along 
the line of sight. We assume that the luminosity or velocity dispersion distribution of the 
satellites is the same for both correlated and uncorrelated satellites. For ease of calculation 
we will project the uncorrelated satellites into an effective distribution at the redshift of 
the primary lens. In doing so we will neglect the lensing of the background galaxy by the 
foreground galaxy, but this should be relatively unimportant given the dominant role of the 
correlated galaxies at small separations. The comoving density of sateUites of luminosity L 
is described by 



2.1. Excess Lensing Cross Sections and Probabilities 




(1) 
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The expression has two parts. The first part, (1 + ^(r)), describes the density of galaxies as a 
function of the comoving distance r from the primary lens. We model the three-dimensional 
correlation function by ^(r) = {r/ro)^"' with 7 ~ 1.8 and a comoving correlation scale of 
To — 5/i^^ Mpc (e.g. Peacock (2001)). The second part is a Schechter (1976) luminosity 
function for the galaxies characterized by comoving density n*, an exponent which we will 
assume to be a = —1 and a luminosity scale L*. In order to convert from luminosity to 
deflection 6, we use the Faber-Jackson (1976) relation L/L^, = [a/a^Y ^'^^^ ^* — 220 km/s to 
convert from luminosity to velocity dispersion. As some of the satellite galaxies might be later 
type galaxies with smaller mass-to-light ratios than the primary lens, the use of the Faber- 
Jackson relation may overestimate the masses of satellites. We want an expression in terms 
of the effective deflection at the rcdshift of the primary lens, and this must take into account 
the redshift dependence of the deflection. In particular, a galaxy producing deflection 61 at 
the redshift of the primary lens produces a deflection 62 = if shifted to rcdshift 

^2 ^ a perturber of the same velocity dispersion but lower (higher) redshift produces a larger 
(smaller) deflection. After making the change of variables we flnd that 

= (1 + ^{r)) ^e-(^?/^^)(^-/^-)' (2) 
dV dbi bi 

where hi is the deflection at the rcdshift of the primary lens, r is the comoving distance from 
the primary lens, 6* = A7i{a^/cyDis/ Dqs is the deflection produced by an galaxy at the 
redshift of the primary lens, and the factor of Dig/ D2S provides the shift in the effective 
deflection scale when the redshift of the sateUite differs significantly from the primary. In 
§2.2 we will derive n* by fitting the observed frequency of satellites in the JVAS/CLASS lens 
sample. We will consider an alternate model, based on the mass function of halos in cold 
dark matter (CDM) simulations in §3. 

The next step is to convert from a comoving volume density to a projected surface 
density per unit solid angle by integrating along the line of sight, where dV — DQ2dDo2d^ 
and D02 is the comoving distance to the redshift of the sateUite. The correlated term is simple 
because the correlation length is small enough to ignore the changes in the deflection scale, 
i.e. D1S/D2S ~ 1- The three-dimensional correlation function is replaced by its projection 

/oo 
dz^ir) = 3.7ro(i?c/ro)"°-' (3) 
-00 

where Rc is the projected comoving distance from the primary lens. It is related to the proper 
distance R and the angular distance ^ by i? = OD'^^ — R^ (1+-Zi) where D'^^ — Dqx/ {l+zi) 
is the angular diameter distance. Thus, the contribution from the correlated galaxies is 

' ^ B„\6(BJ^e-«/«). (4) 



y nnr ■'- 
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The integral over the uncorrelated sateUites is not trivially analytic because of the distance 
factors in the exponential, 



{-^^ -3^W./Me-^»='- (5) 

\ "'"1/ uncorr ^ 

where F{u) = 3exp(M2) x^rfx exp(-MV4(l - xf ), F{0) = 1 and F(l) = 0.29. We derived 
the integral using the fact that -D15/-D25 = 1/2(1 — for x = D02/D0S in a flat universe 
with the primary lens midway between the observer and the source. To 4% accuracy we can 
approximate the integral by F{u) ~ 0.978 - 2.033m + 2.245^^ - 0.901^^ for < m < 1. The 
ratio of the uncorrelated to the correlated surface density is 

uncorr ^^^^ Do,F{u) / i^ey '^ .g. 
corr To \fo J ' 

so the correlated density dominates the surface density on projected scales Rc/bo,c < F{u)~^^'^ > 
5 for L* galaxies, where 6o,c is the comoving length scale corresponding to the deflection bo 
of the primary lens. We combine these to get the total projected density of satellites per 
unit comoving area dAc (or solid angle dfl) 

= [URc) + S^F{b,/b,)]^ ^ ' 

Integrating over the distributions, the fraction of primary lenses having a correlated 
satellite with b^/i < bi < b^. within a projected comoving radius of Rc is 

0.028 (o Qi^aMpc-^) (lO/i-'kpc) (s/i-^Mpc) ' 
while the fraction having an uncorrelated satellite is 

U - 0.012 ( ( ,7Ti§r-:l ' • (9) 



^O.Ol/i^Mpc-V VlO^"^kpc 

The correlated satellites dominate on scales Rc < 30h~^ kpc. 

An isolated lens with critical radius b has a cross section for producing n visible images 
of a„(6) and a total cross section atotip) from the sum of these cross sections. For a spherical 
lens, only 172(6) is non-zero. In theory, our pair of lenses can produce multiple image systems 
with m = 3, 5 or 7 images. However, either 1 or 2 of the images are trapped in the cores 
of the lenses, strongly demagnifled and hence invisible to an observer. For example, in the 
m — 7 image systems, two images are always trapped in the cores to leave only n — 5 



-6- 



visible images. If we characterize the configurations by the total number of images m and 
the number of visible images n (those not trapped in a core), then lenses are produced with 
m/n = 3/2, 3/3, 5/3, 5/4 and 7/5. We will keep track only of the numbers of visible images 
n. For simplicity we did not separate the 3/3 and 3/2 systems. Some 3/3 systems were 
due to our use of cores; the remaining number were a negligible fraction of the 3/2 systems. 
Thus, a pair of lenses with critical radii of bo and bi < bo separated by projected distance R 
have a cross section for producing n visible images of (7„(6o, R) and a total cross section 
of atot{bo,bi,R). 

The probability of observing a lens must include the effects of magnification bias (e.g. 
Turner, Ostriker, Gott (1984)), as magnification due to lensing brings more objects into a 
survey sample. If the sources have a flux distribution dn/dF, then the magnification bias 
factor is 

dP dM dn f F 



B{F) 



dF^ ^ 



(10) 



dM M dF 

given the probability distribution dP/dM of the image magnifications. We assume a fiux 
distribution of 

^(F)ocF-("+^). (11) 

where a = 1.1 for CLASS, as described in RT. So, associated with each cross section 
(e.g. an{bo,bi, R)) is a magnification bias factor (i?„(6o, 6i, i?)) and the probability of ob- 
serving a lens is the product of the two factors, Pn{bo, bi,R) = an{bo, &i, R)Bn{bQ, bi,R). For 
a single isolated lens we have Pn{b) = an{b)Bn{b). 

We are interested in the "excess" lensing cross section or probability created by having 
lenses which are not isolated. We define the excess cross section associated with a satellite 
bi at impact parameter R, 

(7e,n(feo, bi, R) = (7„(6o, R) " (7n(&o) " (7„(6i) (12) 

as the difference between the cross section produced by the combined lenses and that pro- 
duced by the same lenses in isolation. Similarly we define the excess lensing probability 

Pe,nibo, 61, R) = Pn{bo, R) ' Pn(&o) ' Pn{bl) (13) 

as the difference between the lensing probability of the combined lenses and the isolated 
lenses. By summing over the different image multiplicities n we obtain the total excess cross 
section a^^tot and probability Pe,tot- These quantities can be computed for any lens model 
and for any sateUite distribution, although our labehng scheme is tied to our subsequent 
use of SIS lens models. To estimate the significance of the results, we need only examine 
the fractional changes in the optical depth or probability relative to the more massive lens, 

(7e,„(&0,&l,-R)/(7n(&o) and Pe,n(&0, -R)/-Pn(&o)- 
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2.2. Normalizing the Satellite Model 

The key factor in determining the consequences of satelhte galaxies is their absolute 
number, which we parameterize through the value of n*. For normal galaxy luminosity 
functions, the comoving density is n^. ~ 0.01/i^ Mpc~^ (e.g. Loveday (2000), Kochanek et al 
(2001)), although the exact value for use in calculation depends on galaxy type definitions. 
For our present purposes, we adopted an empirical approach of estimating n* from the 
observed numbers of lenses with satellite galaxies. This approach has the advantage of 
avoiding questions about galaxy types, the dependence of the correlation length on galaxy 
types or mass and the extrapolation of the correlation function from large (Mpc) to small 
(kpc) scales. The simplest approximation is that the number of observed systems has little 
dependence on the presence of satellites. We outline this calculation, and then describe how 
we included the weak dependence (see §3) we did find. 

For each lens we search a region R < Rum around the lens for satellites with critical 
radii between that of the primary lens 6o ^-nd a limiting critical radius (i.e. deflection) bum. 
These detection thresholds in turn correspond to a luminosity range Lq = (6o/&*)^ > Lgat > 
Liim = ipiim/b^Y. Given our model for the luminosity function, the expected number of 
satellites in system i is where 

n,Ei = 27r / RJR^ / dh—— (14) 

is determined by the geometry and the luminosity ratios. In practice the integral over h 
will also include a contribution from the fact that increasing h increases the total number of 
lenses. We correct for this by adding a cross section- weighting, replacing dn/db dn/db{l + 
■31om6^ + 0-47o^&^) in Eqn. (14) above. 

We estimate by maximizing the Poisson likehhood for the observed satellites. For 
Niens systems in which system i has observed satellites inside the detection limits set by 
Riim and bn^, the likelihood of the observations is 

N'lens 

log L = {-n^Ei + Ni In n*) (15) 

i=l 

plus constant terms. If the total number of observed satellites is Nsat, the maximum likeli- 
hood estimate for the density is 



sat 



'N, 

* Lens 



(16) 
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with statistical uncertainties 5n*/n^ — l/\/2Nsat- With the correction for the increased cross 
section mentioned above, the calculations become slightly non-linear but are easily solved 
by iteration. 

We performed the calculation for the combined JVAS/CLASS sample of 20 lenses avail- 
able in the literature. We dropped 5 systems: three for which HST imaging data is absent 
(B0128+437, B0445+123 and B0850+054); one where the primary lens has not been identi- 
fied (B1555+375), and one in which the lens geometry is not understood (B21144-022). We 
searched each lens out to a radius Rum of 2'.'0. Of the remaining 15 systems, 5 have satellites 
within 2'.'0 of the primary lens (MG0414+0534, B1030+071, B1152+200, B1359+154, and 
B1608+434) which are visible in the HST images and necessary components for a successful 
lens model. There are six sateUites in total because B1359-I-154 has two satellite galaxies 
inside its Einstein ring. A detailed observational analysis of the satellite selection function 
is beyond the scope of our present analysis, so we simply explored a plausible range of se- 
lection thresholds. For selection thresholds of bi = 0'.'05 (6 satellites), (X'lO (5 satellites) and 
0'.'20 (5 satellites), we found that ra, = (0.016 ± 0m6)h^ Mpc'^, (0.019 ± 0.009)/i3 Mpc^^ 
and (0.032 ih 0.015)/i^ Mpc~^ respectively. We will adopt the most conservative estimate, 
= 0.016/i^ Mpc~^, as our fiducial value. For small changes in n*, the results of §3 for the 
expected numbers of lenses can be scaled linearly with n*. 



We model the lenses as singular isothermal spheres (SIS) characterized by a critical 
radius b and a core radius s, p cc b{r'^ + s^)~^. The core radius s = 0.016 is introduced only 
to avoid numerical singularities. The comoving scale of this core is thus always less than 
55 pc; in addition, the effects of these core radii on the cross section and bias tend to 
cancel out while the additional changes which depend on the lens separations are suppressed 
by our use of the lens sample itself to normahze the model. Thus these models very closely 
approximate SIS models, and for simplicity we will refer to them as SIS models. 

For a two-dimensional lensing potential (p such that = 2fi;(r), the surface density 
of the model in units of the critical density is 



2.3. 



Lens Models and Calculations 




b 1 



(17) 



2 ys^+r^ 



We work in units where the critical radius of the primary lens b^ = 1. We consider satellites 
with critical radii, relative to bo, of 61/60 = 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1.0. Since the 
outer grid scales must be held fixed, we eventually lack the resolution needed to resolve the 
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cores of the low mass satellites (61 < 0.3), forcing us extrapolate the cross sections to smaller 
masses. Since the masses roughly scale as 6^, we probe mass ratios from 1:1 to roughly 1:11. 

For each image configuration we solve the lens equations 



to determine the image positions Xj corresponding to a grid of source positions u. The image 
magnification |M| is analytically derived from the inverse magnification tensor, 



Since we generally dealt with two close potentials, we neglected any external (tidal) shears 
from other sources. While such perturbations are expected in all lenses (e.g. Kovner (1987), 
Bar-Kana (1996)), and seem to be necessary components of any realistic lens model (Keeton, 
Kochanek & Seljak (1997)), the perturbations due to the two satellites are generally large 
enough to neglect those from more distant objects. However, neglecting the tidal fields from 
larger scales does force us to truncate some integrals as we discuss in detail later on. 

We perform the calculation using a variant of the triangle tessellation method (e.g. 
Blandford & Kochanek (1987)) on a 9000^ image plane grid and a 1700^ source plane grid 
covering the multiple image regions of the lens and source planes. The image plane and source 
plane grid spacings were 0.0056o- The cross sections are the number of points in the source 
plane which have a given image multiplicity and the magnifications are the projected area of 
the image plane into the source plane for triangles enclosing source plane points of interest. 
Given our grid resolution, our magnification statistics become poor for magnifications larger 
than 100. We simply truncated the magnification probability distributions at this limit, 
after testing to see that using a higher limit would have little quantitative impact on the 
results. These cross sections and magnifications were combined with the luminosity functions 
for the CLASS survey, an estimate of the number of satellites as a function of mass, and 
the correlation function with the satellites in order to calculate the change in lensing cross 
sections. 

We simulate the procedures of the CLASS radio survey. To be selected for further study, 
the lens has to have at least one image pair with a separation larger than 0'.'3 and a flux ratio 
smaller than 15:1. Neither of these criterion has an enormous effect on the detectability of 
our model lenses, although for low satellite masses (i.e. when bi < 0'/15 = 0.226o), we detect 
only the images dominated by the larger deflection scale of the primary lens. These selection 
functions have little effect; we lose only 1% of the lenses for large satellites and about 4% 
of lenses for 61/60 = 0.5. We assumed that all lenses selected for follow-up studies would 



u = X — V0(x) 



(18) 




(19) 
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Impact parameter (R/bf,) Impact parameter (R/bf,) 

Fig. 1. — (Left) The excess cross section (Te,n(&Oj bi, R) for finding n — 2, S, 4, and 5 images as 
a function of impact parameter R/bo for 61/60 = 0.5. (Right) The excess lensing probabihty 
Pe,n{bo, 61, R) for finding n = 2, 3, 4 and 5 images as a function of impact parameter R/bQ for 
61/60 = 0.5. In both, the solid line dipping below zero refers to doubles, triples are denoted 
with a dashed line, quads with a solid line, and fives with a dot-dashed line. 
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undergo VLBI observations that could detect all images with separations larger than Of!01 
and flux ratios within 100:1 of the brightest image. Ignoring the images in the lens cores, 
we used this criterion to set the number of visible images n we used to classify the lenses. 

If the lenses are well separated, the images are associated with one or the other lens 
and the effects of the companion are well-approximated by an external shear perturbation 
of amplitude 7 = b/2R. Once this shear perturbation becomes smaller than other sources of 
shear on the lens (the ellipticity of the lens or the shears from more distant halos), a model 
based on two circular lenses becomes unrealistic. Thus, we truncate our radial integrations 
once the induced shears drop below a level of 7 = 0.025. For satellites massive enough to 
produce detectable images (61 > O'.'IS), we set the cutoff radius based on the shear induced 
by the primary lens on the satellite. For lower mass satellites, where the CLASS survey 
could not resolve multiple images generated by the satellite, we set the cutoff radius based 
on the shear produced by the satellite on the primary. No matter the exact definitions, our 
effective cross sections will have a discontinuity as the mass of the satellite becomes large 
enough to produce directly detectable image separations. 

A sateUite with a mass too small to produce detectable image separations is also too 
small for us to simulate (due to numerical resolution limitations). We thus had to estimate 
how many of the lenses we found for larger satellite masses would actually be detectable 
for smaller satellite masses. The way we did this was to assume that all the lens systems 
associated solely with the satellites would be undetectable if twice the satellite Einstein 
radius, a characteristic lens separation, was below the CLASS survey resolution. "Associated 
solely with the satellites" means that for instance the two lenses are widely separated and 
all the images are located around the satellite. We estimated the number of images which 
would be solely associated with the primary and then extrapolated this number to low 
satellite masses, where the images associated solely with the primary are the only images 
visible. There were two ways of estimating these "primary only" images. First, we took only 
sources within 3.4 bo of the primary once the lenses were well separated (varying this cutoff 
distance from 2.4 bo to 4 bo was a small effect). This distance was chosen because for all 
satellite masses of interest the caustics of the two lenses would not overlap at this separation. 
Secondly, we took only lenses which had at least one image on the "far side" of the primary 
lens in the primary-secondary configuration. We fit the resulting excesses in both cases as a 
function of 61, extrapolated the results for the region with hi < 0.226o, and then used this 
in our final estimates. The two approaches lead to differences of only 2% in our estimates of 
the quad lens fraction, with fewer quads found when we cut the sample based on the source 
position rather than the image position. For the remainder of the paper, we used the source 
plane cuts because they give the more conservative (i.e. smaller) effect. 
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0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 

satellite critical radius b,/b„ satellite critical radius b,/b„ 

Fig. 2. — (Left) The excess cross section cre,n(&0) bi)n^ for finding n = 2, 3, 4 and 5 images, 
as in equation 20. (Right) The excess lensing probabihty Pe,n(&o, &i)w* for finding n — 2, 3, 
4 and 5 images as in equation 20. The dashed straight hues connect the smallest calculated 
value to the origin, as the excess should disappear as bi — > 0. In both, the solid line dipping 
below zero refers to doubles, triples are denoted with filled triangles, quads with a solid line, 
and fives with open circles. 
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0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 

satellite critical radius b^/bg satellite critical radius bj/b^ 



Fig. 3. — (Left) The excess cross section (Te,n(&Oi ^i)(2/&) exp(— 6^/6^) for finding n = 2, 
3, 4 and 5 images, left hand side of equation 20. The dotted hnes connect the lowest 
calculated points from simulations with the origin. (Right) The excess lensing probability 
Pe,n{t>Q, 6i)(2/6) exp(— 6^/6^) for finding n = 2, 3, 4 and 5 images, left hand side of equation 
20. For bi < 0.226o the angular selection function makes it difficult to see the lenses produced 
by the low mass lens at large separations. This reduces the number of images seen and also 
requires that we truncate our integrals at a separation dictated by the shear on the primary 
due to the secondary rather than vice versa. The fit to the extrapolated truncation of this 
quantity (counting images associated with primary only) causes the noticeable break in the 
curves once the image separations associated with the satellite are too small to be seen in 
the CLASS survey. In both, the solid line dipping below zero refers to doubles, triples are 
denoted with filled triangles, quads with a solid line, and fives with open circles. 
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3. Results 

We first consider the scalings expected for our standard model, and tlien briefly con- 
sider the results for alternative assumptions for the halo mass function, correlation function, 
and halo mass profile. To provide a sense of the expected numbers, the fraction of image 
configurations with n — A visible images produced by an isolated lens with an axis ratio 
of 0.7, typical of early-type galaxies, is 3% based on the cross sections, and 21% based on 
the lensing probabilities. The fraction is higher when based on the probabilities because the 
n = 4 image systems have higher magnifications and thus larger magnification bias factors 
than the n = 2 image systems. This estimate roughly corresponds to results of calculations 
which average over the ellipticity distributions of early-type galaxies (e.g. Kochanek (1996b), 
Keeton, Kochanek & Seljak (1997), RT, Chae (2003)). Recall that the observed fraction of 
lenses with four or more images in the published JVAS/CLASS sample is 50% (9 quads and 
a sextuple in a sample of 20 lenses), so sateUites must roughly double the quad lens fraction 
if they are to be a significant part of the solution to this problem. 

In Figures 1-3 we illustrate the various excess cross sections and probabilities. In Fig. 1 
we show the excess cross section and probability as a function of impact parameter (separa- 
tion) R for a lens with bi — 0.56o- As the two lenses are brought together, the combined lens 
produces more n — A image lenses and fewer n — 2 image lenses as the mutually induced 
tidal shear converts some of the n — 2 image cross section into n — 4 image cross section 
(and some n — 3 and 5 cross section). The depression of the n — 2 image cross section is 
largest at "resonance" where the center of each lens galaxy is projected to the same source 
point and there is a peak in the cross section for producing additional images (see Kochanek 
& Apostolakis (1988)). As the lenses become still closer, the induced ellipticity diminishes 
and there is a net excess of n = 2 image systems. 

Note that the excess probability, the lower panel in Fig. 1, has a different asymptotic 
slope at large impact parameters from the excess cross section because of the effects of 
magnification bias. The shear produced at lens 1 by lens at separation is 7 = 6o/2-R, 
which results in a tidally induced n = 4 image cross section proportional to 6^7^ ~ hy^l/AR^. 
The average magnification of these images diverges as M ~ 7^^, so the magnification bias 
grows like B R°' where a ~ 1.1 is the slope of the number counts (see Kochanek (1996a)). 
As a result, the excess probability decreases only as rather than the R~'^ scaling of the 
cross section at large impact parameter. This very slow ~ R~^ convergence of the excess 
n — A image probability can lead to diverging total probabilities if the integral over the 
impact parameters is not truncated. The natural truncation scale is the radius at which 
the induced shear is comparable to the typical intrinsic ellipticity of the lens galaxy or the 
typical tidal shear, because it is on this scale that the integrals for (more complete) models 
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with additional sources for shear would rapidly converge to the results for isolated lenses. 
For a tidal shear 7^ the average cross section would be oc 61 (7^ + bl/{AB?)) and the average 
magnification would be oc (7^ + 6Q/(4i?^))~^/^ leading to rapid convergence of the excess 
probability to zero once 7^ ~ bo/2R. 

Fig. 2 shows the result of integrating these functions over the satellite density distribu- 
tion 



F{hi/h^) for the uncorrelated terms so that the correlated and uncorrelated terms are directly 
comparable (i.e. they will both be multiplied by ^^^^ to get the full contribution). On a per 
satellite basis, massive satellites dominate the excess cross section and probability, just as 
they dominate the cross sections of isolated lenses. If we integrate over the correlation func- 
tion with the radial integral truncated at an inner radius oi R — hQ (roughly corresponding 
to the resonance region where the production of 5 image systems peaks), then the induced 
correlated 5- image cross section divided by '^^^^ is oc n-irQ%}^%\. The uncorrelated term 
also is proportional to h\ -|- 0((6i/6*)^), so the combination roughly matches the shape of 
the curves in Fig. 3. 

These distributions exaggerate the importance of the more massive satellites because 
they are not weighted by the relative abundances of high and low mass systems. Low mass 
systems are more abundant, due to the additional required factor and so are more likely 

to be close to the critical radius of the primary lens and produce significant perturbations. 
Fig. 3 shows the effect of including this factor. For example, the product of the uncorrelated 
cross section (oc 61) with the Schechter function leads to an overall scaling hi exp{—bl/bl) 
that leads to the rise and then the fall in the excess cross sections shown in Fig. 3. 

The final integrals were performed using a quadratic fit to the probabilities as a function 
of bi before including the dnsch/db weighting (so that we interpolate a smoother function), 
with the interpolating function forced to be zero as bi — > 0. When we combine all these 
effects by integrating over the distribution of less massive sateUites, 



we find a significant effect. 

There is a net loss in the two-image cross section of Pe,2{bo) / Ptot{bo) — —0.11, a net gain 
in the four-image cross section of Pe,4:{bo) / Ptot{bo) — 0.26, and a small contribution from 
other multiphcities {Pe,3{bo) / Ptot{bo) = 0.03 and Pe,5{bo) / Ptot{bo) = 0.003). The total cross 
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section is slightly changed from that of the two potentials in isolation, with a net increase 
in the total probability of about 18%. This increase in probability was taken into account 
in our calculation of mentioned earlier. 

The effect on the quad fraction is much larger because we combine the effects of a net 
suppression in the n — 2 image cross section with a net gain in the n — 4 image cross section. 
If a typical elliptical model predicts that fraction / ~ 21% (for axis ratio 0.7) of lenses will 
be four image systems, the addition of satellites should change the quad fraction to 

{fPtot + Pe,4)/(Ptot + Pe,2 + ^,3 + ^,4 + Pefi) - 40%. (22) 

For doubles, we have 

{fPtot + Pe,2)/{Ptot + Pe,2 + Pe,3 + Pe,4 + Pefi) ^ 57%. (23) 

The satellites nearly double the expected quad fraction, thereby solving the quads-to-doubles 
ratio problem given the statistical uncertainties in the observed ratio and our estimate of 
n*. The model also predicts a 3% contribution from n = 3 and n = 5 lenses, corresponding 
to an expectation of one lens with a non-standard multiplicity in the JVAS/CLASS sample. 
In practice, the JVAS/CLASS survey found one non-standard lens, B1359-I-154, a 6 image 
system formed by a compound lens consisting of 3 lens galaxies. 

If we attempt to break down the excess cross section, we find that no single source 
dominates the result. For massive satellites, correlated, uncorrelated, nearby {R < 5bo) and 
distant satellites all make equal contributions. This is show in Figures 3. In addition, lenses 
due to the primary alone are also shown (the quad to doubles ratio of the lenses associated 
with the primary alone is 31:66). 

In our SIS+SIS model and the computation of the change in the quad fraction we 
model the number of two (four) image lenses produced by a more reahstic SIE+SIS model 
as the sum of the number expected for an SIE in isolation with the excess cross section. We 
compare this simple model to results for the more reahstic SIE+SIS models with sateUites on 
either the major or minor axis of the lens in Figure 5. The results with the SIE bracket the 
SIS models with an average quite similar to our simpler spherical model. If we conducted 
a full suite of models at all inclinations we would simply fill in the region between the 
major and minor axis limits. We can crudely understand why ignoring the ellipticity of the 
primary lens has little effect on the result by considering a lens with two independent external 
shears 70 and 71 representing the ellipticity of the primary lens and the shear induced by 
a sateUite respectively. The four-image cross section of the combined lens simply scales as 
7o + 7i + 27071 cos 2 AO-y where A^^ is the angle between the two shear axes. The angle 
averaged cross section, ~ 70 + 7^, is simply the sum of the four- image cross section of the 



b, ill uiiiLs of bn 



0,2 



0.4 0.6 
b, in units of b„ 



0.8 



Fig. 4. — Contributions to doubles (left) and quads (right) from various effects. The lowest 
(uppermost) solid line is the full contribution. The filled triangles are the correlated contri- 
bution and the open squares are the uncorrelated contribution. The dotted and dashed lines 
are the contributions from within a radius of 56o and outside this radius respectively, and 
the filled hexagons with the line going through them show the contribution from the images 
associated solely to the primary. For images associated solely with the primary, the net 
total number of systems is lower, so to allow easier comparison with the other numbers, this 
curve has been rescaled to give the same net number of systems, thus giving the fractional 
contribution of these systems. 
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Fig. 5. — Comparison of SIS (witli core) primaries and elliptical primaries. The fraction of 
doubles are shown on left, quads on right, plotted as a function of satellite mass. The solid 
hne is the fraction of doubles or quads in the SIS (with core) case used for the bulk of the 
calculations, where the fractions found from an SIS primary and satellites are added to the 
fractions for an elliptical primary in isolation. The dashed lines are for elliptical primary 
lenses either aligned parallel or perpendicular to the direction of the satellite. The dotted line 
is the average of the two elliptical orientations. As the overlapping caustics of the two lenses 
was expected to cause the most variation from the ansatz used in the paper, separations 
only up to 5 6o; five times the Einstein radius of the primary, were included; at and beyond 
this point the caustics of the two lenses are well separated. As can be seen in the plots the 
approximation used in the bulk of the calculations is quite close to the average of the two 
elliptical orientations for the quad case, for the double case the SIS calculation will actually 
tend to give an overestimate. 
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primary galaxy in isolation with that induced by the satellite on a spherical galaxy. Thus, 
while a complete calculation with elliptical primaries averaging over orientation is a logical 
next step, it should have little effect on our basic, quantitative conclusion. We also found 
that the non-circular models were 4-5 times more efficient at producing n — 5 image systems. 

True lenses will also have an external shear which can affect the quad to double ratio. 

We expect this shear to be random with respect to the elliptical axes of the primary and 
thus simulated random shears. Averaging over two shear orientations with ellipticity led to 
a net increase in the quad to double ratio, improving the effects of the satellite galaxies, just 
as would be expected from the argument above. A detailed study of the full parameter space 
would be interesting for future work. 



Our standard model was based on treating close satellites of the primary lens as in- 
dependent galaxies on larger scales. Using an empirical normalization for their abundance 
(n*) helps keep the results from changing drastically under changes of the distribution of 
the satellites in mass or radius. To see how robust these results are, we consider here alter- 
nate assumptions about the distribution of the satellites in mass, separation and or internal 
structure. 

We first consider changing from a Schechter distribution of satellites, motivated by 
galaxies, to a power law model motivated by CDM halo models. Simulations find halo mass 
functions that are power laws. 



(e.g. Moore et al. (1999), Klypin et al. (1999)). with a normahzation of = O.Ol/i^ Mpc~^ 
at — IO^'^H'^Mq for a standard "concordance" model with ag = 0.9 (Jenkins et al. (2001)). 
If we are using a model with potentially dark halos, we should avoid normalizations based 
on the numbers of visible satellites. We convert mass to luminosity to critical radii (with 
distance dependence) as in section §2. 

In the CDM picture, the number of low mass halos and galaxies rapidly diverges towards 
lower mass. NFW-like galaxy profiles (Navarro, Frenk & White (1996), Moore et al. (1999)) 
are expected when only gravitational forces are important, but these do not lens efficiently. 
Once baryonic cooling occurs, the halos contract to the very effectively lensing SIS profiles 
(e.g. Keeton (1998), Porciani & Madau (2000), Kochanek & White (2001), Li & Ostriker 
(2000; 2003)). In some sense, our Schechter function model for the satellites already is the 



3.1. 



Variations to Standard Model 
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correct model for the halos in which baryons have cooled. However, as an experiment, we also 
tried using the mass function of halos, cutting off the halos with such low circular velocities 
that they may have lost all their gas when the universe reionized (see Bullock, Kravtsov 
& Weinberg (2000)). We set this scale to bi = 0.056* (corresponding to t\, ~ 50 km/s), 
and then set the number of halos based on CDM simulations. Under these assumptions, we 
again find a significant suppression of the n = 2 configurations, with Pe,2/ Ptot = —0.46, a 
significant gain in the n = 4 configurations, with P^a/ Ptot = 1-12, and a small contribution 
from non-standard configurations with P^s/Ptot — 0.12 and Pe,b/Ptot — 0.05. For these 
excess cross sections we would expect quads to significantly outnumber doubles with a 73:18 
ratio, and again to have a small number of triples or quints. 

We used an extrapolation of the correlation function, measured on large scales, to model 
the projected density of satellites in the inner regions of galaxies. In practice, large mass 
satellites which orbit too close to the center of the primary lens will undergo rapid orbital 
decay and be destroyed (e.g. Metcalf & Madau (2001)). This can be modeled by setting 
a scale Rfaii ~ 20-30 kpc for the onset of the rapid decay, and then assuming there are 
no satellites interior to this radius. This leads to a modified correlation function ^2{R) 
which flattens in the center to a constant value (roughly S,2{R) ~ 10"^ /i^^ Mpc ) rather 
than continuing to rise as a power-law. However, since we must set the normalization to 
reproduce the observed numbers of satellites (finding n* = 0.027 ± 0.009/i^ Mpc~^ for the 
Rfaii — 22 kpc case we tried as an experiment), the change in the radial structure has little 
effect on the lens statistics. Assuming the Schechter model for the satellite mass function, 
we find the now familiar suppression of the doubles, Pe,2/ Ptot — —0.16, and enhancement 
of the quads, Pe,4/Ptot = 0.41, leading (with the inclusion of the triples and fives) to an 
expectation of a quad:double ratio of 49:49 assuming lenses with a typical axis ratio of 
/ = 0.7, considering lenses associated with the primary alone gives a corresponding ratio 
of 35:62. Using the CDM inspired mass function instead of the Schechter model produces 
again significantly more quads, 67%, compared to 30% doubles. 

As the correlation function at short distances is not well known, we also experimented 
with changing the logarithmic slope of ^{r) = (r/rg)"^ from 7 = 1.8. For each case we 
estimated from the observed numbers of satellites. Steeper correlation functions weaken 
the effect of satellites. However, a steeper correlation function can also increase the number 
of quads significantly, if it is coupled with a restriction that no satellites occur within a fixed 
radius (such as 22 kpc). The cause of these correlations is easily understood from Fig. 1. 
For a fixed number of sateUites, a steeper correlation function simultaneously adds weight to 
the central peak and reduces the weight of the resonance region, both of which will enhance 
the probability of producing two-image lenses. A shallower correlation function does the 
reverse, thereby enhancing the probability of producing four-image lenses. Better statistics 
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for the distribution of lens galaxy satellites for different image morphologies may be able to 
discriminate between these models. 

For our final experiment we truncated the halos of the two lenses by using pseudo-Jaffe 
models (Jaffe (1983), Keeton (2001)) with p oc h{r'^ + s^)-'^{r'^ + a?)-"^ and a = 106 rather 
than SIS models for the two lens components. Normahzed by the total cross sections for 
the isolated pseudo-Jaffe models, the suppression of the doubles, Pe,2/Ptot = —0-11, and the 
enhancement of the quads, Pe,i/ Ptot = 0.17, is somewhat less than for our standard SIS+SIS 
models, raising the quad fraction from the 21% expected for / = 0.7 ellipsoids to 35% rather 
than to 40%. 



4. Conclusions 



The high fraction of four-image lenses (50% of JVAS/CLASS lenses have four or more 
images) compared to the expectations for simple ellipsoidal lenses in reasonable tidal fields 
(20% to 30%) has been a long standing puzzle in understanding the statistics of gravitational 
lenses (King & Browne (1996), Kochanek (1996b), Kccton, Kochanek & Seljak (1997), Finch 
et al. (2002), Rusin & Tegmark (2001)). We find that the problem is largely explained by 
the changes in the caustic structures produced by including the effects of nearby sateUite 
galaxies when we normalize the abundance of these satellites to match the abundance of 
satellites in the JVAS/CLASS lens sample. The sateUites produce relatively small changes 
in the total lensing probabilities, but systematically suppress the probability of obtaining a 
two-image lens in favor of finding a four-image lens, with a small probability of finding non- 
standard 3 or 5 image lenses. Depending on the parameters and the assumptions, satellites 
can roughly double the 21% quad fraction expected for a typical elliptical galaxy (axis ratio 
0.7) up to a 35-50% quad fraction that is relatively easy to reconcile with the data. Provided 
we normahze the satellite abundance to the observations, this conclusion is little affected by 
the details of what we assume for the structure or spatial distribution of the satellites. A 
steeply rising mass function will have a notable effect, however, it is possible that the lowest 
mass satellites will be affected more strongly by tidal stripping and be less well approximated 
by the Faber-Jackson relation we assumed initially. It would be interesting to pursue this 
in a future work. The typical satellites responsible for the changes should have critical 
radii (luminosities) roughly 50% (25%) that of the primary lens, again consistent with the 
observations where the satellites in the JVAS/CLASS sample have luminosity ratios of 0.08, 
0.09, 0.26, 0.40, 0.40, 0.45 relative to the primary lenses. The surveys should also find 
that ~3% of the lenses should have non-standard multiplicities, which is borne out by the 
discovery of two higher multiplicity lenses (B1359-I-154, a 6 image lens produced by 3 galaxies 
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Fig. 6. — Biases in image separation estimates due to satellites. We show the average image 
radius (dashed line) and one-half the maximum image separation (solid line) in units of the 
critical radius of the primary lens 60 as a function of the satellite critical radius hi. We show 
the averages for regions with R < 2bo and 46o-The dotted line shows the image critical radius 
bo + hi which we would find if two SIS lenses are exactly aligned (i? = 0). 
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in the CLASS survey, Myers et al. (1999); and PMNJ0134-0931, a 5 image lens produced by 
2 galaxies in the PMN survey, Gregg et al. (2002), Winn et al. (2002)) in the flat-spectrum 
radio lens surveys. 

The satellite galaxies can also interfere with efforts to estimate the cosmological model 
using lens statistics (e.g. Chae et al. (2002), Davis, Huterer & Krauss (2003) and refer- 
ences therein), as it is crucial in such studies to match the observed distributions of image 
separations in order to correctly estimate the lens cross sections (see Kochanek (1996a)). 
Close satellites modify the image separations, leading to biased estimates of the average 
multiple-imaging cross sections that in turn bias estimates of the cosmological model. Sim- 
ply dropping lenses with satelhtes is not a solution because it builds a bias against high 
mass lenses into the calculation. The changes in the cross section produced by the sateUite 
depend on the ratio r = h/bo, while the luminosity of the sateUite depends on — r'^b^. 
Thus, even though the satelhtes create the same fractional perturbation for all lenses, the 
satellites of the higher mass lenses are more easily detected simply because they are more 
luminous. For example, suppose every lens of luminosity Lq had a satellite of luminosity 
Li = Lo/A, and that we could detect any satellite with Li > L*/4. Every lens with Lq > 
would be rejected from the sample because it has a detectable satellite. The smaller average 
image separations of the final sample (only 68% that of the original sample) would lead to 
an underestimate of the average velocity dispersion (by about 20%) and cross section (by a 
factor of two) of the lenses. But the bias on the cross sections produced by the existence of 
satellites has not changedlAW the lenses remaining in the sample have satellites producing 
the same fractional perturbations to the cross sections as they do in the rejected, massive 
systems. This is an extreme example, but the resulting bias in more realistic cases still 
represents a serious problem, and may explain the very low velocity dispersion scales found 
by Chae et al. ((2002)) and Davis et al. ((2003)). 

Inclusion of the satellites does bias the image separations upwards, however, as shown 
in Fig. 6. We computed the average distance of the lensed images from the primary lens or 
one-half the maximum image separations, standard estimators for the average critical radius 
of a lens, as a function of the critical radius of the satellite. We included the weighting of the 
radial distribution of the satellites by the correlation function and computed the averages for 
the regions with R < 2bo and Abo- The amplitude of the bias is modest except for systems 
with comparable masses. In particular, note that the average effect is significantly smaller 
than simply adding the critical radii of the two lenses. If we continue our (extreme) thought 
experiment of a sample of lenses all having satellites with bi/bo — l/2,we would overestimate 
the average image separations of an isolated lens by approximately 20% (using the R < 26o 
region, for a more detailed calculation one would choose this region self-consistently) if we 
simply used the observed image separations. For this case we would get an overestimate 
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of the velocity dispersion (by about 10%) and the cross section (by 40%). However, the 
magnitude of the bias from simply including all the systems with satellites is only half that 
from excluding them. 

In practice, the magnitude of the effect and the ambiguities arising from any simple 
treatment mean that reliable estimates of cosmological parameters or galaxy mass scales 
from lensing statistics cannot be based on isolated lenses. The calculations must include the 
effects of satellites. Fortunately, the model for the satellites, particularly the abundance of 
satellites, can be calibrated directly from the observations. As part of this process, more 
careful observational surveys need to be made of lens galaxy environments than the crude 
normalization estimates we made in §2.2. Alternatively the models can be checked against 
numerical simulations of lens environments (e.g. White et al. (2001), Holder & Schechter 
(2003), Chen, Kravtsov & Keeton (2003)). Pull calculations with eUiptical lenses, large scale 
tidal shear fields and satellites should be done. We also limited our calculation to sateUites, 
but more attention should be given to the statistical effects of embedding the lenses in more 
massive group or cluster halos. As the lens sample continues to grow, so does our ability to 
understand and quantitatively model the effects of satellites, so the ambiguities in statistical 
studies of lenses introduced by satellites will be resolved. 
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